Improve logic to cleanup server-session exactly once#6408
Improve logic to cleanup server-session exactly once#6408shinrich wants to merge 1 commit intoapache:masterfrom
Conversation
| tunnel.deallocate_buffers(); | ||
| tunnel.reset(); | ||
| if (server_entry) { | ||
| server_entry->in_tunnel = false; |
There was a problem hiding this comment.
Is this necessary? It looks like this is done in line 5421.
| SMDebug("http_tunnel", "send 408 response to client to vc %p, tunnel vc %p", ua_txn->get_netvc(), p->vc); | ||
|
|
||
| tunnel.chain_abort_all(p); | ||
| server_session = nullptr; |
There was a problem hiding this comment.
IIUC, the crash will be fixed by setting server_entry->in_tunnel = false here.
I'm a bit afraid that this PR is too generic for fixing a crash.
|
Unfortunately, I still get the crash with this patch. Have you tried this with |
|
So far, our 9.0.x plus PR #6407 and PR #6404 is not crashing. Our crashes were intermittent, so I'll keep an eye on things today. Unfortunately, PR #6407 and #6404 do not play together well. With the fix for #6407, the read_complete has already been sent for the post data so the HttpSM handler is state_watch_for_client_abort when the second trailer READ_COMPLETE is sent. This causes the transaction to fail and the nghttp request is responded with a 502 error. In our case, the faulty clients are not sending trailing headers, so just fixing it in the trailing header logic is not sufficient. |
Do you mean #6407 and #6408 ? The reason why I asked trying nghttp with trailers is that this patch should fix the crash without #6407 nor #6404 changes. For trailers, we should not signal READ_COMPLETE on final DATA frame. I can do that on #6404, after #6407 is merged. |
|
I talked with @bryancall today, and since the according to the spec we should really be returning a a HTTP/2 error to the client if they are not sending a data frame EOS. I didn't get time today, but I would like to spend some more time tomorrow correctly identifying and cleaning up this case. While PR #6408 avoids the crash, it does not return the error to the client. |
|
I think we can take this one out of the mix |
|
Remember to clear Milestone / Project when closing a PR without merging :). |
Breaking apart the PR in #6401. This one includes the logic to better ensure that the server_session is closed and deleted in a timely manner. The crash we were seeing was due to a server_session staying around after the HttpSM had been deleted. The inactivity cop would execute the handleEvent on the deleted HttpSM continuation.